Dataset statistics
| Number of variables | 18 |
|---|---|
| Number of observations | 23856 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 3.3 MiB |
| Average record size in memory | 144.0 B |
Variable types
| NUM | 16 |
|---|---|
| BOOL | 1 |
| CAT | 1 |
Reproduction
| Analysis started | 2020-06-19 13:27:23.127417 |
|---|---|
| Analysis finished | 2020-06-19 13:28:06.993907 |
| Duration | 43.87 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| X_10 is highly skewed (γ1 = 34.9427132) | Skewed |
|---|---|
| INCIDENT_ID has unique values | Unique |
| X_1 has 19036 (79.8%) zeros | Zeros |
| X_4 has 3335 (14.0%) zeros | Zeros |
| X_5 has 4695 (19.7%) zeros | Zeros |
| X_7 has 3461 (14.5%) zeros | Zeros |
| X_8 has 8774 (36.8%) zeros | Zeros |
| X_11 has 2553 (10.7%) zeros | Zeros |
| X_14 has 288 (1.2%) zeros | Zeros |
| X_15 has 1017 (4.3%) zeros | Zeros |
INCIDENT_ID
Categorical
| Distinct count | 23856 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 186.4 KiB |
| Value | Count | Frequency (%) | |
| CR_192721 | 1 | < 0.1% | |
| CR_155199 | 1 | < 0.1% | |
| CR_25493 | 1 | < 0.1% | |
| CR_80148 | 1 | < 0.1% | |
| CR_132739 | 1 | < 0.1% | |
| CR_42347 | 1 | < 0.1% | |
| CR_168612 | 1 | < 0.1% | |
| CR_46115 | 1 | < 0.1% | |
| CR_71321 | 1 | < 0.1% | |
| CR_94805 | 1 | < 0.1% | |
| Other values (23846) | 23846 | > 99.9% |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 8.44714118 |
| Min length | 4 |
X_1
Real number (ℝ≥0)
| Distinct count | 8 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.4837776659959759 |
|---|---|
| Minimum | 0 |
| Maximum | 7 |
| Zeros | 19036 |
| Zeros (%) | 79.8% |
| Memory size | 186.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 3 |
| Maximum | 7 |
| Range | 7 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.439737889 |
|---|---|
| Coefficient of variation (CV) | 2.976032152 |
| Kurtosis | 13.65891063 |
| Mean | 0.483777666 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.789307148 |
| Sum | 11541 |
| Variance | 2.072845188 |
| Value | Count | Frequency (%) | |
| 0 | 19036 | 79.8% | |
| 1 | 3497 | 14.7% | |
| 7 | 876 | 3.7% | |
| 5 | 270 | 1.1% | |
| 3 | 136 | 0.6% | |
| 4 | 26 | 0.1% | |
| 2 | 10 | < 0.1% | |
| 6 | 5 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 19036 | 79.8% | |
| 1 | 3497 | 14.7% | |
| 2 | 10 | < 0.1% | |
| 3 | 136 | 0.6% | |
| 4 | 26 | 0.1% |
| Value | Count | Frequency (%) | |
| 7 | 876 | 3.7% | |
| 6 | 5 | < 0.1% | |
| 5 | 270 | 1.1% | |
| 4 | 26 | 0.1% | |
| 3 | 136 | 0.6% |
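The descriptive statistics for X_1 can be cross-checked against its frequency table. The sketch below is a minimal, standard-library recomputation, not the profiler's own code. The variable names are illustrative, and the n − 1 (sample) denominator for the variance is an assumption about how the report computes it.

```python
import math

# Value -> count pairs copied from the X_1 frequency table above.
x1_counts = {0: 19036, 1: 3497, 7: 876, 5: 270, 3: 136, 4: 26, 2: 10, 6: 5}

n = sum(x1_counts.values())                        # number of observations
total = sum(v * c for v, c in x1_counts.items())   # "Sum" row
mean = total / n                                   # "Mean" row

# Assumption: sample variance with an (n - 1) denominator.
variance = sum(c * (v - mean) ** 2 for v, c in x1_counts.items()) / (n - 1)
std = math.sqrt(variance)                          # "Standard deviation" row
cv = std / mean                                    # "Coefficient of variation (CV)" row
zeros_pct = 100 * x1_counts[0] / n                 # "Zeros (%)" row

print(n, total, round(mean, 9), round(variance, 9), round(std, 9), round(cv, 9))
```

The same recomputation works for any variable whose frequency table lists every distinct value. It does not work for tables that collapse the tail into an "Other values" row.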
X_2
Real number (ℝ≥0)
| Distinct count | 52 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 24.791205566733737 |
|---|---|
| Minimum | 0 |
| Maximum | 52 |
| Zeros | 22 |
| Zeros (%) | 0.1% |
| Memory size | 186.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 7 |
| median | 24 |
| Q3 | 36 |
| 95-th percentile | 49 |
| Maximum | 52 |
| Range | 52 |
| Interquartile range (IQR) | 29 |
Descriptive statistics
| Standard deviation | 15.24023098 |
|---|---|
| Coefficient of variation (CV) | 0.6147434395 |
| Kurtosis | -1.30551524 |
| Mean | 24.79120557 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | -0.0947521072 |
| Sum | 591419 |
| Variance | 232.2646403 |
| Value | Count | Frequency (%) | |
| 4 | 4029 | 16.9% | |
| 36 | 2232 | 9.4% | |
| 33 | 2174 | 9.1% | |
| 24 | 1344 | 5.6% | |
| 21 | 1254 | 5.3% | |
| 37 | 962 | 4.0% | |
| 49 | 927 | 3.9% | |
| 45 | 908 | 3.8% | |
| 3 | 778 | 3.3% | |
| 22 | 672 | 2.8% | |
| Other values (42) | 8576 | 35.9% |
| Value | Count | Frequency (%) | |
| 0 | 22 | 0.1% | |
| 1 | 20 | 0.1% | |
| 2 | 116 | 0.5% | |
| 3 | 778 | 3.3% | |
| 4 | 4029 | 16.9% |
| Value | Count | Frequency (%) | |
| 52 | 19 | 0.1% | |
| 51 | 103 | 0.4% | |
| 50 | 160 | 0.7% | |
| 49 | 927 | 3.9% | |
| 48 | 55 | 0.2% |
X_4
Real number (ℝ≥0)
| Distinct count | 10 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.276743796109994 |
|---|---|
| Minimum | 0 |
| Maximum | 10 |
| Zeros | 3335 |
| Zeros (%) | 14.0% |
| Memory size | 186.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 4 |
| Q3 | 6 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 2.944672067 |
|---|---|
| Coefficient of variation (CV) | 0.6885313238 |
| Kurtosis | -1.013239087 |
| Mean | 4.276743796 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.1833932631 |
| Sum | 102026 |
| Variance | 8.671093584 |
| Value | Count | Frequency (%) | |
| 6 | 5497 | 23.0% | |
| 2 | 4791 | 20.1% | |
| 0 | 3335 | 14.0% | |
| 7 | 2890 | 12.1% | |
| 4 | 2027 | 8.5% | |
| 3 | 1871 | 7.8% | |
| 9 | 1360 | 5.7% | |
| 10 | 1242 | 5.2% | |
| 1 | 841 | 3.5% | |
| 5 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 3335 | 14.0% | |
| 1 | 841 | 3.5% | |
| 2 | 4791 | 20.1% | |
| 3 | 1871 | 7.8% | |
| 4 | 2027 | 8.5% |
| Value | Count | Frequency (%) | |
| 10 | 1242 | 5.2% | |
| 9 | 1360 | 5.7% | |
| 7 | 2890 | 12.1% | |
| 6 | 5497 | 23.0% | |
| 5 | 2 | < 0.1% |
X_5
Real number (ℝ≥0)
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.4556086519114686 |
|---|---|
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 4695 |
| Zeros (%) | 19.7% |
| Memory size | 186.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 1.963094729 |
|---|---|
| Coefficient of variation (CV) | 0.7994330562 |
| Kurtosis | -1.558871205 |
| Mean | 2.455608652 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.1752310231 |
| Sum | 58581 |
| Variance | 3.853740916 |
| Value | Count | Frequency (%) | |
| 5 | 7368 | 30.9% | |
| 1 | 6818 | 28.6% | |
| 3 | 4973 | 20.8% | |
| 0 | 4695 | 19.7% | |
| 2 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 4695 | 19.7% | |
| 1 | 6818 | 28.6% | |
| 2 | 2 | < 0.1% | |
| 3 | 4973 | 20.8% | |
| 5 | 7368 | 30.9% |
| Value | Count | Frequency (%) | |
| 5 | 7368 | 30.9% | |
| 3 | 4973 | 20.8% | |
| 2 | 2 | < 0.1% | |
| 1 | 6818 | 28.6% | |
| 0 | 4695 | 19.7% |
X_6
Real number (ℝ≥0)
| Distinct count | 19 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.154175050301811 |
|---|---|
| Minimum | 1 |
| Maximum | 19 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 186.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 5 |
| Q3 | 8 |
| 95-th percentile | 15 |
| Maximum | 19 |
| Range | 18 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 4.471756047 |
|---|---|
| Coefficient of variation (CV) | 0.7266215229 |
| Kurtosis | 0.03760850344 |
| Mean | 6.15417505 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.9608294397 |
| Sum | 146814 |
| Variance | 19.99660214 |
| Value | Count | Frequency (%) | |
| 1 | 3461 | 14.5% | |
| 5 | 2679 | 11.2% | |
| 6 | 2629 | 11.0% | |
| 4 | 2319 | 9.7% | |
| 15 | 2318 | 9.7% | |
| 2 | 2298 | 9.6% | |
| 7 | 2286 | 9.6% | |
| 3 | 1708 | 7.2% | |
| 8 | 1405 | 5.9% | |
| 9 | 1267 | 5.3% | |
| Other values (9) | 1486 | 6.2% |
| Value | Count | Frequency (%) | |
| 1 | 3461 | 14.5% | |
| 2 | 2298 | 9.6% | |
| 3 | 1708 | 7.2% | |
| 4 | 2319 | 9.7% | |
| 5 | 2679 | 11.2% |
| Value | Count | Frequency (%) | |
| 19 | 2 | < 0.1% | |
| 18 | 162 | 0.7% | |
| 17 | 110 | 0.5% | |
| 16 | 620 | 2.6% | |
| 15 | 2318 | 9.7% |
X_7
Real number (ℝ≥0)
| Distinct count | 19 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.876509054325956 |
|---|---|
| Minimum | 0 |
| Maximum | 18 |
| Zeros | 3461 |
| Zeros (%) | 14.5% |
| Memory size | 186.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 4 |
| Q3 | 7 |
| 95-th percentile | 12 |
| Maximum | 18 |
| Range | 18 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 3.881930665 |
|---|---|
| Coefficient of variation (CV) | 0.7960470538 |
| Kurtosis | 0.493689765 |
| Mean | 4.876509054 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.7961675929 |
| Sum | 116334 |
| Variance | 15.06938569 |
| Value | Count | Frequency (%) | |
| 0 | 3461 | 14.5% | |
| 6 | 2679 | 11.2% | |
| 4 | 2629 | 11.0% | |
| 2 | 2319 | 9.7% | |
| 10 | 2318 | 9.7% | |
| 7 | 2298 | 9.6% | |
| 1 | 2286 | 9.6% | |
| 5 | 1708 | 7.2% | |
| 3 | 1405 | 5.9% | |
| 8 | 1267 | 5.3% | |
| Other values (9) | 1486 | 6.2% |
| Value | Count | Frequency (%) | |
| 0 | 3461 | 14.5% | |
| 1 | 2286 | 9.6% | |
| 2 | 2319 | 9.7% | |
| 3 | 1405 | 5.9% | |
| 4 | 2629 | 11.0% |
| Value | Count | Frequency (%) | |
| 18 | 139 | 0.6% | |
| 17 | 200 | 0.8% | |
| 16 | 210 | 0.9% | |
| 15 | 25 | 0.1% | |
| 14 | 18 | 0.1% |
X_8
Real number (ℝ≥0)
| Distinct count | 24 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.9724597585513078 |
|---|---|
| Minimum | 0 |
| Maximum | 99 |
| Zeros | 8774 |
| Zeros (%) | 36.8% |
| Memory size | 186.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 3 |
| Maximum | 99 |
| Range | 99 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.453144468 |
|---|---|
| Coefficient of variation (CV) | 1.494297789 |
| Kurtosis | 952.9615467 |
| Mean | 0.9724597586 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 17.70384903 |
| Sum | 23199 |
| Variance | 2.111628843 |
| Value | Count | Frequency (%) | |
| 1 | 11010 | 46.2% | |
| 0 | 8774 | 36.8% | |
| 2 | 2268 | 9.5% | |
| 3 | 967 | 4.1% | |
| 4 | 404 | 1.7% | |
| 5 | 207 | 0.9% | |
| 6 | 79 | 0.3% | |
| 7 | 33 | 0.1% | |
| 8 | 32 | 0.1% | |
| 10 | 23 | 0.1% | |
| Other values (14) | 59 | 0.2% |
| Value | Count | Frequency (%) | |
| 0 | 8774 | 36.8% | |
| 1 | 11010 | 46.2% | |
| 2 | 2268 | 9.5% | |
| 3 | 967 | 4.1% | |
| 4 | 404 | 1.7% |
| Value | Count | Frequency (%) | |
| 99 | 1 | < 0.1% | |
| 50 | 1 | < 0.1% | |
| 30 | 1 | < 0.1% | |
| 29 | 1 | < 0.1% | |
| 22 | 1 | < 0.1% |
X_9
Real number (ℝ≥0)
| Distinct count | 7 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.924128101945003 |
|---|---|
| Minimum | 0 |
| Maximum | 6 |
| Zeros | 118 |
| Zeros (%) | 0.5% |
| Memory size | 186.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 5 |
| median | 5 |
| Q3 | 6 |
| 95-th percentile | 6 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.362624612 |
|---|---|
| Coefficient of variation (CV) | 0.276724038 |
| Kurtosis | 1.28166232 |
| Mean | 4.924128102 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -1.525286754 |
| Sum | 117470 |
| Variance | 1.856745834 |
| Value | Count | Frequency (%) | |
| 5 | 10559 | 44.3% | |
| 6 | 9508 | 39.9% | |
| 2 | 3040 | 12.7% | |
| 3 | 452 | 1.9% | |
| 1 | 175 | 0.7% | |
| 0 | 118 | 0.5% | |
| 4 | 4 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 118 | 0.5% | |
| 1 | 175 | 0.7% | |
| 2 | 3040 | 12.7% | |
| 3 | 452 | 1.9% | |
| 4 | 4 | < 0.1% |
| Value | Count | Frequency (%) | |
| 6 | 9508 | 39.9% | |
| 5 | 10559 | 44.3% | |
| 4 | 4 | < 0.1% | |
| 3 | 452 | 1.9% | |
| 2 | 3040 | 12.7% |
X_10
Real number (ℝ≥0)
| Distinct count | 24 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.244802146210597 |
|---|---|
| Minimum | 1 |
| Maximum | 90 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 186.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 2 |
| Maximum | 90 |
| Range | 89 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.119300682 |
|---|---|
| Coefficient of variation (CV) | 0.8991795888 |
| Kurtosis | 2190.137157 |
| Mean | 1.244802146 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 34.9427132 |
| Sum | 29696 |
| Variance | 1.252834017 |
| Value | Count | Frequency (%) | |
| 1 | 20198 | 84.7% | |
| 2 | 2695 | 11.3% | |
| 3 | 549 | 2.3% | |
| 4 | 225 | 0.9% | |
| 5 | 71 | 0.3% | |
| 6 | 54 | 0.2% | |
| 8 | 15 | 0.1% | |
| 10 | 14 | 0.1% | |
| 9 | 7 | < 0.1% | |
| 7 | 7 | < 0.1% | |
| Other values (14) | 21 | 0.1% |
| Value | Count | Frequency (%) | |
| 1 | 20198 | 84.7% | |
| 2 | 2695 | 11.3% | |
| 3 | 549 | 2.3% | |
| 4 | 225 | 0.9% | |
| 5 | 71 | 0.3% |
| Value | Count | Frequency (%) | |
| 90 | 1 | < 0.1% | |
| 58 | 1 | < 0.1% | |
| 50 | 1 | < 0.1% | |
| 40 | 1 | < 0.1% | |
| 30 | 1 | < 0.1% |
X_11
Real number (ℝ≥0)
| Distinct count | 133 |
|---|---|
| Unique (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 206.95451877934272 |
|---|---|
| Minimum | 0 |
| Maximum | 332 |
| Zeros | 2553 |
| Zeros (%) | 10.7% |
| Memory size | 186.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 174 |
| median | 249 |
| Q3 | 249 |
| 95-th percentile | 316 |
| Maximum | 332 |
| Range | 332 |
| Interquartile range (IQR) | 75 |
Descriptive statistics
| Standard deviation | 93.03334801 |
|---|---|
| Coefficient of variation (CV) | 0.4495352339 |
| Kurtosis | 0.1944049511 |
| Mean | 206.9545188 |
| Median Absolute Deviation (MAD) | 67 |
| Skewness | -0.9032002688 |
| Sum | 4937107 |
| Variance | 8655.203842 |
| Value | Count | Frequency (%) | |
| 174 | 7275 | 30.5% | |
| 249 | 6930 | 29.0% | |
| 316 | 4500 | 18.9% | |
| 0 | 2553 | 10.7% | |
| 303 | 438 | 1.8% | |
| 127 | 304 | 1.3% | |
| 74 | 207 | 0.9% | |
| 179 | 206 | 0.9% | |
| 102 | 122 | 0.5% | |
| 263 | 103 | 0.4% | |
| Other values (123) | 1218 | 5.1% |
| Value | Count | Frequency (%) | |
| 0 | 2553 | 10.7% | |
| 1 | 3 | < 0.1% | |
| 6 | 1 | < 0.1% | |
| 11 | 5 | < 0.1% | |
| 12 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 332 | 3 | < 0.1% | |
| 330 | 29 | 0.1% | |
| 329 | 21 | 0.1% | |
| 328 | 79 | 0.3% | |
| 327 | 1 | < 0.1% |
X_13
Real number (ℝ≥0)
| Distinct count | 60 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 85.23738262910798 |
|---|---|
| Minimum | 0 |
| Maximum | 116 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 186.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 18 |
| Q1 | 72 |
| median | 98 |
| Q3 | 103 |
| 95-th percentile | 112 |
| Maximum | 116 |
| Range | 116 |
| Interquartile range (IQR) | 31 |
Descriptive statistics
| Standard deviation | 27.59722639 |
|---|---|
| Coefficient of variation (CV) | 0.3237690499 |
| Kurtosis | 1.093046857 |
| Mean | 85.23738263 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | -1.388636749 |
| Sum | 2033423 |
| Variance | 761.6069043 |
| Value | Count | Frequency (%) | |
| 103 | 6995 | 29.3% | |
| 72 | 4476 | 18.8% | |
| 92 | 3255 | 13.6% | |
| 112 | 2116 | 8.9% | |
| 98 | 1366 | 5.7% | |
| 18 | 851 | 3.6% | |
| 109 | 537 | 2.3% | |
| 24 | 523 | 2.2% | |
| 12 | 427 | 1.8% | |
| 59 | 348 | 1.5% | |
| Other values (50) | 2962 | 12.4% |
| Value | Count | Frequency (%) | |
| 0 | 1 | < 0.1% | |
| 1 | 5 | < 0.1% | |
| 2 | 210 | 0.9% | |
| 7 | 1 | < 0.1% | |
| 8 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 116 | 288 | 1.2% | |
| 115 | 21 | 0.1% | |
| 114 | 16 | 0.1% | |
| 113 | 225 | 0.9% | |
| 112 | 2116 | 8.9% |
X_14
Real number (ℝ≥0)
| Distinct count | 62 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 72.67429577464789 |
|---|---|
| Minimum | 0 |
| Maximum | 142 |
| Zeros | 288 |
| Zeros (%) | 1.2% |
| Memory size | 186.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 29 |
| Q1 | 29 |
| median | 62 |
| Q3 | 107 |
| 95-th percentile | 142 |
| Maximum | 142 |
| Range | 142 |
| Interquartile range (IQR) | 78 |
Descriptive statistics
| Standard deviation | 43.2973203 |
|---|---|
| Coefficient of variation (CV) | 0.5957721342 |
| Kurtosis | -1.324908795 |
| Mean | 72.67429577 |
| Median Absolute Deviation (MAD) | 33 |
| Skewness | 0.2455877663 |
| Sum | 1733718 |
| Variance | 1874.657945 |
| Value | Count | Frequency (%) | |
| 29 | 8165 | 34.2% | |
| 93 | 3110 | 13.0% | |
| 142 | 2714 | 11.4% | |
| 62 | 2474 | 10.4% | |
| 80 | 1488 | 6.2% | |
| 130 | 1205 | 5.1% | |
| 107 | 734 | 3.1% | |
| 14 | 657 | 2.8% | |
| 119 | 579 | 2.4% | |
| 103 | 506 | 2.1% | |
| Other values (52) | 2224 | 9.3% |
| Value | Count | Frequency (%) | |
| 0 | 288 | 1.2% | |
| 2 | 1 | < 0.1% | |
| 6 | 119 | 0.5% | |
| 12 | 1 | < 0.1% | |
| 14 | 657 | 2.8% |
| Value | Count | Frequency (%) | |
| 142 | 2714 | 11.4% | |
| 140 | 74 | 0.3% | |
| 139 | 10 | < 0.1% | |
| 138 | 137 | 0.6% | |
| 136 | 66 | 0.3% |
X_15
Real number (ℝ≥0)
| Distinct count | 28 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 33.46474681421864 |
|---|---|
| Minimum | 0 |
| Maximum | 50 |
| Zeros | 1017 |
| Zeros (%) | 4.3% |
| Memory size | 186.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 23 |
| Q1 | 34 |
| median | 34 |
| Q3 | 34 |
| 95-th percentile | 46 |
| Maximum | 50 |
| Range | 50 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 8.38683369 |
|---|---|
| Coefficient of variation (CV) | 0.2506169772 |
| Kurtosis | 8.7395923 |
| Mean | 33.46474681 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -2.527453789 |
| Sum | 798335 |
| Variance | 70.33897934 |
| Value | Count | Frequency (%) | |
| 34 | 18947 | 79.4% | |
| 43 | 1503 | 6.3% | |
| 0 | 1017 | 4.3% | |
| 46 | 668 | 2.8% | |
| 23 | 642 | 2.7% | |
| 48 | 521 | 2.2% | |
| 36 | 182 | 0.8% | |
| 50 | 145 | 0.6% | |
| 9 | 92 | 0.4% | |
| 39 | 54 | 0.2% | |
| Other values (18) | 85 | 0.4% |
| Value | Count | Frequency (%) | |
| 0 | 1017 | 4.3% | |
| 4 | 4 | < 0.1% | |
| 5 | 1 | < 0.1% | |
| 8 | 1 | < 0.1% | |
| 9 | 92 | 0.4% |
| Value | Count | Frequency (%) | |
| 50 | 145 | 0.6% | |
| 48 | 521 | 2.2% | |
| 46 | 668 | 2.8% | |
| 43 | 1503 | 6.3% | |
| 41 | 6 | < 0.1% |
MULTIPLE_OFFENSE
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 186.4 KiB |
| Value | Count | Frequency (%) | |
| 1 | 22788 | 95.5% | |
| 0 | 1068 | 4.5% |
YEAR
Real number (ℝ≥0)
| Distinct count | 28 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2004.2491197183099 |
|---|---|
| Minimum | 1991 |
| Maximum | 2018 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 186.4 KiB |
Quantile statistics
| Minimum | 1991 |
|---|---|
| 5-th percentile | 1992 |
| Q1 | 1998 |
| median | 2004 |
| Q3 | 2011 |
| 95-th percentile | 2017 |
| Maximum | 2018 |
| Range | 27 |
| Interquartile range (IQR) | 13 |
Descriptive statistics
| Standard deviation | 7.795183852 |
|---|---|
| Coefficient of variation (CV) | 0.003889328814 |
| Kurtosis | -1.117659496 |
| Mean | 2004.24912 |
| Median Absolute Deviation (MAD) | 7 |
| Skewness | 0.1050002008 |
| Sum | 47813367 |
| Variance | 60.76489128 |
| Value | Count | Frequency (%) | |
| 2001 | 1186 | 5.0% | |
| 1996 | 1040 | 4.4% | |
| 2000 | 1016 | 4.3% | |
| 2006 | 989 | 4.1% | |
| 1993 | 962 | 4.0% | |
| 1997 | 952 | 4.0% | |
| 1998 | 947 | 4.0% | |
| 2008 | 941 | 3.9% | |
| 2007 | 903 | 3.8% | |
| 2004 | 897 | 3.8% | |
| Other values (18) | 14023 | 58.8% |
| Value | Count | Frequency (%) | |
| 1991 | 512 | 2.1% | |
| 1992 | 792 | 3.3% | |
| 1993 | 962 | 4.0% | |
| 1994 | 724 | 3.0% | |
| 1995 | 838 | 3.5% |
| Value | Count | Frequency (%) | |
| 2018 | 816 | 3.4% | |
| 2017 | 884 | 3.7% | |
| 2016 | 743 | 3.1% | |
| 2015 | 720 | 3.0% | |
| 2014 | 678 | 2.8% |
MONTH
Real number (ℝ≥0)
| Distinct count | 12 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.50884473507713 |
|---|---|
| Minimum | 1 |
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 186.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 4 |
| median | 7 |
| Q3 | 9 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 3.29380263 |
|---|---|
| Coefficient of variation (CV) | 0.5060502692 |
| Kurtosis | -1.134943097 |
| Mean | 6.508844735 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -0.03037886156 |
| Sum | 155275 |
| Variance | 10.84913577 |
| Value | Count | Frequency (%) | |
| 9 | 2290 | 9.6% | |
| 7 | 2157 | 9.0% | |
| 5 | 2142 | 9.0% | |
| 10 | 2138 | 9.0% | |
| 4 | 2114 | 8.9% | |
| 6 | 2113 | 8.9% | |
| 8 | 2110 | 8.8% | |
| 3 | 1990 | 8.3% | |
| 11 | 1853 | 7.8% | |
| 1 | 1739 | 7.3% | |
| Other values (2) | 3210 | 13.5% |
| Value | Count | Frequency (%) | |
| 1 | 1739 | 7.3% | |
| 2 | 1715 | 7.2% | |
| 3 | 1990 | 8.3% | |
| 4 | 2114 | 8.9% | |
| 5 | 2142 | 9.0% |
| Value | Count | Frequency (%) | |
| 12 | 1495 | 6.3% | |
| 11 | 1853 | 7.8% | |
| 10 | 2138 | 9.0% | |
| 9 | 2290 | 9.6% | |
| 8 | 2110 | 8.8% |
DAY
Real number (ℝ≥0)
| Distinct count | 31 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.564637826961771 |
|---|---|
| Minimum | 1 |
| Maximum | 31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 186.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 8 |
| median | 15 |
| Q3 | 23 |
| 95-th percentile | 29 |
| Maximum | 31 |
| Range | 30 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 8.775545585 |
|---|---|
| Coefficient of variation (CV) | 0.5638130281 |
| Kurtosis | -1.175053951 |
| Mean | 15.56463783 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 0.01334935669 |
| Sum | 371310 |
| Variance | 77.01020031 |
| Value | Count | Frequency (%) | |
| 1 | 946 | 4.0% | |
| 7 | 847 | 3.6% | |
| 15 | 840 | 3.5% | |
| 13 | 836 | 3.5% | |
| 12 | 823 | 3.4% | |
| 21 | 822 | 3.4% | |
| 18 | 817 | 3.4% | |
| 23 | 799 | 3.3% | |
| 10 | 797 | 3.3% | |
| 24 | 796 | 3.3% | |
| Other values (21) | 15533 | 65.1% |
| Value | Count | Frequency (%) | |
| 1 | 946 | 4.0% | |
| 2 | 792 | 3.3% | |
| 3 | 762 | 3.2% | |
| 4 | 732 | 3.1% | |
| 5 | 742 | 3.1% |
| Value | Count | Frequency (%) | |
| 31 | 393 | 1.6% | |
| 30 | 733 | 3.1% | |
| 29 | 703 | 2.9% | |
| 28 | 731 | 3.1% | |
| 27 | 766 | 3.2% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the line, that is, its angle to the x-axis, does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
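The definition above, covariance divided by the product of the standard deviations, can be sketched in plain Python. This is an illustrative implementation, not the profiler's code. The function name `pearson_r` and the toy data are made up for the example.

```python
import math

def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of deviations are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Toy data (not taken from the dataset).
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r = pearson_r(x, y)
# Shifting and positively rescaling y leaves r unchanged.
r_rescaled = pearson_r(x, [3 * v + 5 for v in y])
print(r, r_rescaled)
```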
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
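As a rough illustration of the rank-based definition, the sketch below replaces each value by its rank and then applies the Pearson formula to the ranks. Ties get the average of the ranks they span. The helper names `average_ranks` and `spearman_rho` and the toy data are assumptions made for this example, not part of the report.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the rank variables."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# A nonlinear but strictly increasing relationship (toy data).
x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]
print(spearman_rho(x, y))
```

Because only the ordering matters, any strictly increasing transformation of either variable leaves ρ unchanged.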
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
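The pair-counting definition can be sketched directly. The version below is the simple form described in the text, often called tau-a. Statistical libraries commonly report a tie-corrected variant instead, so values for heavily tied columns may differ from this sketch. The function name `kendall_tau_a` and the toy data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction,
        # discordant when they move in opposite directions; ties count as neither.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

# Toy data: two concordant pairs and one discordant pair.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

This direct approach compares every pair and is therefore quadratic in the number of observations. That is acceptable for a sketch but slow for tens of thousands of rows.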
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.
First rows
| | INCIDENT_ID | X_1 | X_2 | X_4 | X_5 | X_6 | X_7 | X_8 | X_9 | X_10 | X_11 | X_13 | X_14 | X_15 | MULTIPLE_OFFENSE | YEAR | MONTH | DAY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | CR_102659 | 0 | 36 | 2 | 1 | 5 | 6 | 1 | 6 | 1 | 174 | 92 | 29 | 36 | 0 | 2004 | 7 | 4 |
| 1 | CR_189752 | 1 | 37 | 0 | 0 | 11 | 17 | 1 | 6 | 1 | 236 | 103 | 142 | 34 | 1 | 2017 | 7 | 18 |
| 2 | CR_184637 | 0 | 3 | 3 | 5 | 1 | 0 | 2 | 3 | 1 | 174 | 110 | 93 | 34 | 1 | 2017 | 3 | 15 |
| 3 | CR_139071 | 0 | 33 | 2 | 1 | 7 | 1 | 1 | 6 | 1 | 249 | 72 | 29 | 34 | 1 | 2009 | 2 | 13 |
| 4 | CR_109335 | 0 | 33 | 2 | 1 | 8 | 3 | 0 | 5 | 1 | 174 | 112 | 29 | 43 | 1 | 2005 | 4 | 13 |
| 5 | CR_96263 | 0 | 45 | 10 | 3 | 1 | 0 | 1 | 6 | 1 | 303 | 72 | 62 | 34 | 1 | 2003 | 4 | 7 |
| 6 | CR_131400 | 0 | 30 | 7 | 3 | 7 | 1 | 0 | 5 | 1 | 174 | 112 | 29 | 43 | 1 | 2008 | 1 | 22 |
| 7 | CR_11981 | 0 | 8 | 7 | 3 | 9 | 8 | 0 | 5 | 1 | 316 | 72 | 62 | 34 | 1 | 1993 | 5 | 14 |
| 8 | CR_184134 | 0 | 49 | 6 | 5 | 8 | 3 | 1 | 1 | 1 | 316 | 103 | 14 | 34 | 1 | 2016 | 8 | 21 |
| 9 | CR_32634 | 1 | 4 | 6 | 5 | 15 | 10 | 0 | 5 | 2 | 145 | 103 | 29 | 34 | 0 | 1996 | 8 | 25 |
Last rows
| | INCIDENT_ID | X_1 | X_2 | X_4 | X_5 | X_6 | X_7 | X_8 | X_9 | X_10 | X_11 | X_13 | X_14 | X_15 | MULTIPLE_OFFENSE | YEAR | MONTH | DAY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 23846 | CR_79724 | 1 | 36 | 2 | 1 | 15 | 10 | 0 | 5 | 1 | 249 | 92 | 130 | 34 | 1 | 2001 | 9 | 17 |
| 23847 | CR_38033 | 0 | 33 | 2 | 1 | 5 | 6 | 1 | 6 | 2 | 249 | 103 | 93 | 34 | 1 | 1996 | 5 | 2 |
| 23848 | CR_14384 | 0 | 26 | 9 | 0 | 3 | 5 | 0 | 5 | 1 | 249 | 112 | 130 | 34 | 1 | 1993 | 12 | 2 |
| 23849 | CR_68953 | 7 | 25 | 9 | 0 | 9 | 8 | 0 | 5 | 1 | 249 | 72 | 93 | 34 | 1 | 2000 | 4 | 25 |
| 23850 | CR_33201 | 0 | 4 | 6 | 5 | 1 | 0 | 2 | 6 | 1 | 0 | 72 | 29 | 34 | 1 | 1996 | 7 | 11 |
| 23851 | CR_88991 | 1 | 47 | 7 | 3 | 15 | 10 | 1 | 5 | 1 | 174 | 98 | 29 | 34 | 1 | 2002 | 1 | 11 |
| 23852 | CR_46369 | 0 | 33 | 2 | 1 | 5 | 6 | 0 | 5 | 1 | 174 | 112 | 29 | 43 | 1 | 1997 | 2 | 5 |
| 23853 | CR_157556 | 0 | 25 | 9 | 0 | 3 | 5 | 1 | 6 | 1 | 174 | 10 | 29 | 18 | 1 | 2012 | 4 | 3 |
| 23854 | CR_103180 | 0 | 39 | 6 | 5 | 2 | 7 | 1 | 6 | 1 | 127 | 112 | 103 | 43 | 1 | 2004 | 1 | 25 |
| 23855 | CR_22575 | 7 | 36 | 2 | 1 | 9 | 8 | 0 | 5 | 1 | 249 | 92 | 29 | 34 | 1 | 1994 | 11 | 8 |